{% extends "base.html" %} {% block body %}
CKD Dataset EDA Report

Overview

Dataset statistics

Number of variables: 26
Number of observations: 400
Missing cells: 1009
Missing cells (%): 9.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 81.4 KiB
Average record size in memory: 208.3 B
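As a quick sanity check, the missing-cell percentage can be recomputed from the counts above: 1009 missing cells out of 400 × 26 cells. The sketch below only redoes that arithmetic; the variable names are illustrative and not part of the report.

```python
# Counts as reported under "Dataset statistics".
n_variables = 26
n_observations = 400
missing_cells = 1009

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```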

Variable types

Numeric: 11
Categorical: 12
Boolean: 3

Alerts

white_blood_cell_count has a high cardinality: 92 distinct values [High cardinality]
id is highly overall correlated with albumin and 5 other fields [High correlation]
albumin is highly overall correlated with id and 7 other fields [High correlation]
sugar is highly overall correlated with blood_glucose_random [High correlation]
blood_glucose_random is highly overall correlated with sugar [High correlation]
blood _urea is highly overall correlated with serum_creatinine and 1 other fields [High correlation]
serum_creatinine is highly overall correlated with id and 3 other fields [High correlation]
sodium is highly overall correlated with albumin and 1 other fields [High correlation]
potassium is highly overall correlated with packed_cell_volume and 1 other fields [High correlation]
hemoglobine is highly overall correlated with id and 9 other fields [High correlation]
specific_gravety is highly overall correlated with ckd [High correlation]
red_blood_cells is highly overall correlated with id and 3 other fields [High correlation]
pus_cell is highly overall correlated with albumin and 4 other fields [High correlation]
pus_cell_clumps is highly overall correlated with pus_cell [High correlation]
packed_cell_volume is highly overall correlated with potassium and 6 other fields [High correlation]
red_blood_cell_count is highly overall correlated with potassium and 4 other fields [High correlation]
hypertension is highly overall correlated with id and 6 other fields [High correlation]
diabetes_mellitias is highly overall correlated with hypertension [High correlation]
anemia is highly overall correlated with hemoglobine and 2 other fields [High correlation]
ckd is highly overall correlated with id and 7 other fields [High correlation]
pus_cell_clumps is highly imbalanced (51.2%) [Imbalance]
bacteria is highly imbalanced (69.0%) [Imbalance]
diabetes_mellitias is highly imbalanced (54.9%) [Imbalance]
coronary_artery_disease is highly imbalanced (70.6%) [Imbalance]
age has 9 (2.2%) missing values [Missing]
blood_pressure has 12 (3.0%) missing values [Missing]
specific_gravety has 47 (11.8%) missing values [Missing]
albumin has 46 (11.5%) missing values [Missing]
sugar has 49 (12.2%) missing values [Missing]
red_blood_cells has 152 (38.0%) missing values [Missing]
pus_cell has 65 (16.2%) missing values [Missing]
blood_glucose_random has 44 (11.0%) missing values [Missing]
blood _urea has 19 (4.8%) missing values [Missing]
serum_creatinine has 17 (4.2%) missing values [Missing]
sodium has 87 (21.8%) missing values [Missing]
potassium has 88 (22.0%) missing values [Missing]
hemoglobine has 52 (13.0%) missing values [Missing]
packed_cell_volume has 70 (17.5%) missing values [Missing]
white_blood_cell_count has 105 (26.2%) missing values [Missing]
red_blood_cell_count has 130 (32.5%) missing values [Missing]
id is uniformly distributed [Uniform]
id has unique values [Unique]
albumin has 199 (49.8%) zeros [Zeros]
sugar has 290 (72.5%) zeros [Zeros]
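The imbalance percentages in the alerts above are consistent with one minus the normalized Shannon entropy of the class counts. That reading is an inference from the reported numbers, not something the report states, and the helper name `imbalance_score` is hypothetical. The sketch below applies it to the class counts reported for pus_cell_clumps (354 vs 42) and bacteria (374 vs 22).

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for balanced classes, near 1 for one dominant class."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(probs) < 2:
        return 1.0  # a single class is maximally imbalanced
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(probs))


# Class counts taken from the variable sections of this report.
clumps = imbalance_score([354, 42])
bacteria = imbalance_score([374, 22])
print(f"pus_cell_clumps: {clumps:.1%}, bacteria: {bacteria:.1%}")
```

A score near 0 means the classes are close to evenly split; the further it rises toward 1, the more a single class dominates.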

Reproduction

Analysis started: 2022-12-28 02:45:34.298979
Analysis finished: 2022-12-28 02:45:58.304965
Duration: 24.01 seconds
Software version: pandas-profiling v3.6.1
Download configuration: config.json
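The reported duration follows directly from the two timestamps. A minimal sketch of that subtraction, using only the standard library (the format string is an assumption about how the timestamps are laid out):

```python
from datetime import datetime

# Timestamp layout assumed from the values shown in the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2022-12-28 02:45:34.298979", FMT)
finished = datetime.strptime("2022-12-28 02:45:58.304965", FMT)

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```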

Variables

id
Real number (ℝ)

HIGH CORRELATION  UNIFORM  UNIQUE 

Distinct: 400
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 199.5
Minimum: 0
Maximum: 399
Zeros: 1
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 3.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 19.95
Q1: 99.75
median: 199.5
Q3: 299.25
95-th percentile: 379.05
Maximum: 399
Range: 399
Interquartile range (IQR): 199.5

Descriptive statistics

Standard deviation: 115.6143
Coefficient of variation (CV): 0.57952031
Kurtosis: -1.2
Mean: 199.5
Median Absolute Deviation (MAD): 100
Skewness: 0
Sum: 79800
Variance: 13366.667
Monotonicity: Strictly increasing
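Because id is unique, strictly increasing, and spans 0 to 399 over 400 rows, its statistics can be reproduced from first principles. The sketch below assumes id is exactly the sequence 0, 1, ..., 399 and that percentiles use linear interpolation; the `percentile` helper is illustrative, not the profiler's own code. The reported variance matches the sample variance (n - 1 denominator).

```python
import math
import statistics

ids = list(range(400))  # assumption: id holds 0, 1, ..., 399


def percentile(sorted_values, q):
    """Linearly interpolated percentile for q in [0, 1]."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


mean = statistics.mean(ids)
variance = statistics.variance(ids)  # sample variance, n - 1 denominator
std = math.sqrt(variance)
cv = std / mean                      # coefficient of variation
p5 = percentile(ids, 0.05)
q1 = percentile(ids, 0.25)

print(mean, round(variance, 3), round(std, 4), round(p5, 2), q1)
```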
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0 | 1 | 0.2%
263 | 1 | 0.2%
273 | 1 | 0.2%
272 | 1 | 0.2%
271 | 1 | 0.2%
270 | 1 | 0.2%
269 | 1 | 0.2%
268 | 1 | 0.2%
267 | 1 | 0.2%
266 | 1 | 0.2%
Other values (390) | 390 | 97.5%

Minimum 10 values
Value | Count | Frequency (%)
0 | 1 | 0.2%
1 | 1 | 0.2%
2 | 1 | 0.2%
3 | 1 | 0.2%
4 | 1 | 0.2%
5 | 1 | 0.2%
6 | 1 | 0.2%
7 | 1 | 0.2%
8 | 1 | 0.2%
9 | 1 | 0.2%

Maximum 10 values
Value | Count | Frequency (%)
399 | 1 | 0.2%
398 | 1 | 0.2%
397 | 1 | 0.2%
396 | 1 | 0.2%
395 | 1 | 0.2%
394 | 1 | 0.2%
393 | 1 | 0.2%
392 | 1 | 0.2%
391 | 1 | 0.2%
390 | 1 | 0.2%

age
Real number (ℝ)

Distinct: 76
Distinct (%): 19.4%
Missing: 9
Missing (%): 2.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 51.483376
Minimum: 2
Maximum: 90
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.2 KiB

Quantile statistics

Minimum: 2
5-th percentile: 19
Q1: 42
median: 55
Q3: 64.5
95-th percentile: 74.5
Maximum: 90
Range: 88
Interquartile range (IQR): 22.5

Descriptive statistics

Standard deviation: 17.169714
Coefficient of variation (CV): 0.33350016
Kurtosis: 0.057840495
Mean: 51.483376
Median Absolute Deviation (MAD): 10
Skewness: -0.66825947
Sum: 20130
Variance: 294.79908
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
60 | 19 | 4.8%
65 | 17 | 4.2%
48 | 12 | 3.0%
50 | 12 | 3.0%
55 | 12 | 3.0%
47 | 11 | 2.8%
56 | 10 | 2.5%
59 | 10 | 2.5%
45 | 10 | 2.5%
54 | 10 | 2.5%
Other values (66) | 268 | 67.0%

Minimum 10 values
Value | Count | Frequency (%)
2 | 1 | 0.2%
3 | 1 | 0.2%
4 | 1 | 0.2%
5 | 2 | 0.5%
6 | 1 | 0.2%
7 | 1 | 0.2%
8 | 3 | 0.8%
11 | 1 | 0.2%
12 | 2 | 0.5%
14 | 1 | 0.2%

Maximum 10 values
Value | Count | Frequency (%)
90 | 1 | 0.2%
83 | 1 | 0.2%
82 | 1 | 0.2%
81 | 1 | 0.2%
80 | 4 | 1.0%
79 | 1 | 0.2%
78 | 1 | 0.2%
76 | 5 | 1.2%
75 | 5 | 1.2%
74 | 3 | 0.8%

blood_pressure
Real number (ℝ)

Distinct: 10
Distinct (%): 2.6%
Missing: 12
Missing (%): 3.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 76.469072
Minimum: 50
Maximum: 180
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.2 KiB

Quantile statistics

Minimum: 50
5-th percentile: 60
Q1: 70
median: 80
Q3: 80
95-th percentile: 100
Maximum: 180
Range: 130
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 13.683637
Coefficient of variation (CV): 0.17894342
Kurtosis: 8.6460952
Mean: 76.469072
Median Absolute Deviation (MAD): 10
Skewness: 1.605429
Sum: 29670
Variance: 187.24194
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)

Common values
Value | Count | Frequency (%)
80 | 116 | 29.0%
70 | 112 | 28.0%
60 | 71 | 17.8%
90 | 53 | 13.2%
100 | 25 | 6.2%
50 | 5 | 1.2%
110 | 3 | 0.8%
140 | 1 | 0.2%
180 | 1 | 0.2%
120 | 1 | 0.2%
(Missing) | 12 | 3.0%

Minimum 10 values
Value | Count | Frequency (%)
50 | 5 | 1.2%
60 | 71 | 17.8%
70 | 112 | 28.0%
80 | 116 | 29.0%
90 | 53 | 13.2%
100 | 25 | 6.2%
110 | 3 | 0.8%
120 | 1 | 0.2%
140 | 1 | 0.2%
180 | 1 | 0.2%

Maximum 10 values
Value | Count | Frequency (%)
180 | 1 | 0.2%
140 | 1 | 0.2%
120 | 1 | 0.2%
110 | 3 | 0.8%
100 | 25 | 6.2%
90 | 53 | 13.2%
80 | 116 | 29.0%
70 | 112 | 28.0%
60 | 71 | 17.8%
50 | 5 | 1.2%

specific_gravety
Categorical

HIGH CORRELATION  MISSING 

Distinct: 5
Distinct (%): 1.4%
Missing: 47
Missing (%): 11.8%
Memory size: 3.2 KiB

Value | Count
1.02 | 106
1.01 | 84
1.025 | 81
1.015 | 75
1.005 | 7

Length

Max length: 5
Median length: 4
Mean length: 4.4617564
Min length: 4

Characters and Unicode

Total characters: 1575
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.02
2nd row: 1.02
3rd row: 1.01
4th row: 1.005
5th row: 1.01

Common Values

Value | Count | Frequency (%)
1.02 | 106 | 26.5%
1.01 | 84 | 21.0%
1.025 | 81 | 20.2%
1.015 | 75 | 18.8%
1.005 | 7 | 1.8%
(Missing) | 47 | 11.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
1.02 | 106 | 30.0%
1.01 | 84 | 23.8%
1.025 | 81 | 22.9%
1.015 | 75 | 21.2%
1.005 | 7 | 2.0%
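The two frequency tables for specific_gravety list the same counts but different percentages. The figures are consistent with the first table dividing by all 400 rows and the second dividing by the 353 non-missing rows. That is an inference from the numbers, sketched below with illustrative variable names.

```python
# Counts as reported for specific_gravety.
counts = {"1.02": 106, "1.01": 84, "1.025": 81, "1.015": 75, "1.005": 7}
n_rows = 400                      # all observations, including missing
n_present = sum(counts.values())  # non-missing observations only

share_of_all = {k: 100 * v / n_rows for k, v in counts.items()}
share_of_present = {k: 100 * v / n_present for k, v in counts.items()}

for value in counts:
    print(f"{value}: {share_of_all[value]:.1f}% of all rows, "
          f"{share_of_present[value]:.1f}% of non-missing rows")
```

When comparing categories across variables with different missing rates, the non-missing denominator is usually the fairer basis.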

Most occurring characters

ValueCountFrequency (%)
1 512
32.5%
0 360
22.9%
. 353
22.4%
2 187
 
11.9%
5 163
 
10.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1222
77.6%
Other Punctuation 353
 
22.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 512
41.9%
0 360
29.5%
2 187
 
15.3%
5 163
 
13.3%
Other Punctuation
ValueCountFrequency (%)
. 353
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1575
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 512
32.5%
0 360
22.9%
. 353
22.4%
2 187
 
11.9%
5 163
 
10.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1575
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 512
32.5%
0 360
22.9%
. 353
22.4%
2 187
 
11.9%
5 163
 
10.3%

albumin
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 6
Distinct (%): 1.7%
Missing: 46
Missing (%): 11.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.0169492
Minimum: 0
Maximum: 5
Zeros: 199
Zeros (%): 49.8%
Negative: 0
Negative (%): 0.0%
Memory size: 3.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 2
95-th percentile: 4
Maximum: 5
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.3526789
Coefficient of variation (CV): 1.3301343
Kurtosis: -0.3833766
Mean: 1.0169492
Median Absolute Deviation (MAD): 0
Skewness: 0.99815724
Sum: 360
Variance: 1.8297402
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)

Common values
Value | Count | Frequency (%)
0 | 199 | 49.8%
1 | 44 | 11.0%
2 | 43 | 10.8%
3 | 43 | 10.8%
4 | 24 | 6.0%
5 | 1 | 0.2%
(Missing) | 46 | 11.5%

Minimum 10 values
Value | Count | Frequency (%)
0 | 199 | 49.8%
1 | 44 | 11.0%
2 | 43 | 10.8%
3 | 43 | 10.8%
4 | 24 | 6.0%
5 | 1 | 0.2%

Maximum 10 values
Value | Count | Frequency (%)
5 | 1 | 0.2%
4 | 24 | 6.0%
3 | 43 | 10.8%
2 | 43 | 10.8%
1 | 44 | 11.0%
0 | 199 | 49.8%

sugar
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 6
Distinct (%): 1.7%
Missing: 49
Missing (%): 12.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.45014245
Minimum: 0
Maximum: 5
Zeros: 290
Zeros (%): 72.5%
Negative: 0
Negative (%): 0.0%
Memory size: 3.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 3
Maximum: 5
Range: 5
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.0991913
Coefficient of variation (CV): 2.4418742
Kurtosis: 5.055348
Mean: 0.45014245
Median Absolute Deviation (MAD): 0
Skewness: 2.4642618
Sum: 158
Variance: 1.2082214
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)

Common values
Value | Count | Frequency (%)
0 | 290 | 72.5%
2 | 18 | 4.5%
3 | 14 | 3.5%
4 | 13 | 3.2%
1 | 13 | 3.2%
5 | 3 | 0.8%
(Missing) | 49 | 12.2%

Minimum 10 values
Value | Count | Frequency (%)
0 | 290 | 72.5%
1 | 13 | 3.2%
2 | 18 | 4.5%
3 | 14 | 3.5%
4 | 13 | 3.2%
5 | 3 | 0.8%

Maximum 10 values
Value | Count | Frequency (%)
5 | 3 | 0.8%
4 | 13 | 3.2%
3 | 14 | 3.5%
2 | 18 | 4.5%
1 | 13 | 3.2%
0 | 290 | 72.5%

red_blood_cells
Categorical

HIGH CORRELATION  MISSING 

Distinct: 2
Distinct (%): 0.8%
Missing: 152
Missing (%): 38.0%
Memory size: 3.2 KiB

Value | Count
normal | 201
abnormal | 47

Length

Max length: 8
Median length: 6
Mean length: 6.3790323
Min length: 6

Characters and Unicode

Total characters: 1582
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: normal
2nd row: normal
3rd row: normal
4th row: normal
5th row: normal

Common Values

Value | Count | Frequency (%)
normal | 201 | 50.2%
abnormal | 47 | 11.8%
(Missing) | 152 | 38.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
normal | 201 | 81.0%
abnormal | 47 | 19.0%

Most occurring characters

ValueCountFrequency (%)
a 295
18.6%
n 248
15.7%
o 248
15.7%
r 248
15.7%
m 248
15.7%
l 248
15.7%
b 47
 
3.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1582
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 295
18.6%
n 248
15.7%
o 248
15.7%
r 248
15.7%
m 248
15.7%
l 248
15.7%
b 47
 
3.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1582
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 295
18.6%
n 248
15.7%
o 248
15.7%
r 248
15.7%
m 248
15.7%
l 248
15.7%
b 47
 
3.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1582
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 295
18.6%
n 248
15.7%
o 248
15.7%
r 248
15.7%
m 248
15.7%
l 248
15.7%
b 47
 
3.0%

pus_cell
Categorical

HIGH CORRELATION  MISSING 

Distinct: 2
Distinct (%): 0.6%
Missing: 65
Missing (%): 16.2%
Memory size: 3.2 KiB

Value | Count
normal | 259
abnormal | 76

Length

Max length: 8
Median length: 6
Mean length: 6.4537313
Min length: 6

Characters and Unicode

Total characters: 2162
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: normal
2nd row: normal
3rd row: normal
4th row: abnormal
5th row: normal

Common Values

Value | Count | Frequency (%)
normal | 259 | 64.8%
abnormal | 76 | 19.0%
(Missing) | 65 | 16.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
normal | 259 | 77.3%
abnormal | 76 | 22.7%

Most occurring characters

ValueCountFrequency (%)
a 411
19.0%
n 335
15.5%
o 335
15.5%
r 335
15.5%
m 335
15.5%
l 335
15.5%
b 76
 
3.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2162
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 411
19.0%
n 335
15.5%
o 335
15.5%
r 335
15.5%
m 335
15.5%
l 335
15.5%
b 76
 
3.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 2162
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 411
19.0%
n 335
15.5%
o 335
15.5%
r 335
15.5%
m 335
15.5%
l 335
15.5%
b 76
 
3.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2162
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 411
19.0%
n 335
15.5%
o 335
15.5%
r 335
15.5%
m 335
15.5%
l 335
15.5%
b 76
 
3.5%

pus_cell_clumps
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.5%
Missing: 4
Missing (%): 1.0%
Memory size: 3.2 KiB

Value | Count
notpresent | 354
present | 42

Length

Max length: 10
Median length: 10
Mean length: 9.6818182
Min length: 7

Characters and Unicode

Total characters: 3834
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: notpresent
2nd row: notpresent
3rd row: notpresent
4th row: present
5th row: notpresent

Common Values

Value | Count | Frequency (%)
notpresent | 354 | 88.5%
present | 42 | 10.5%
(Missing) | 4 | 1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
notpresent | 354 | 89.4%
present | 42 | 10.6%

Most occurring characters

ValueCountFrequency (%)
e 792
20.7%
n 750
19.6%
t 750
19.6%
p 396
10.3%
r 396
10.3%
s 396
10.3%
o 354
9.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3834
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 792
20.7%
n 750
19.6%
t 750
19.6%
p 396
10.3%
r 396
10.3%
s 396
10.3%
o 354
9.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 3834
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 792
20.7%
n 750
19.6%
t 750
19.6%
p 396
10.3%
r 396
10.3%
s 396
10.3%
o 354
9.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3834
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 792
20.7%
n 750
19.6%
t 750
19.6%
p 396
10.3%
r 396
10.3%
s 396
10.3%
o 354
9.2%

bacteria
Categorical

Distinct: 2
Distinct (%): 0.5%
Missing: 4
Missing (%): 1.0%
Memory size: 3.2 KiB

Value | Count
notpresent | 374
present | 22

Length

Max length: 10
Median length: 10
Mean length: 9.8333333
Min length: 7

Characters and Unicode

Total characters: 3894
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: notpresent
2nd row: notpresent
3rd row: notpresent
4th row: notpresent
5th row: notpresent

Common Values

Value | Count | Frequency (%)
notpresent | 374 | 93.5%
present | 22 | 5.5%
(Missing) | 4 | 1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
notpresent | 374 | 94.4%
present | 22 | 5.6%

Most occurring characters

ValueCountFrequency (%)
e 792
20.3%
n 770
19.8%
t 770
19.8%
p 396
10.2%
r 396
10.2%
s 396
10.2%
o 374
9.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3894
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 792
20.3%
n 770
19.8%
t 770
19.8%
p 396
10.2%
r 396
10.2%
s 396
10.2%
o 374
9.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 3894
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 792
20.3%
n 770
19.8%
t 770
19.8%
p 396
10.2%
r 396
10.2%
s 396
10.2%
o 374
9.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3894
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 792
20.3%
n 770
19.8%
t 770
19.8%
p 396
10.2%
r 396
10.2%
s 396
10.2%
o 374
9.6%

blood_glucose_random
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 146
Distinct (%): 41.0%
Missing: 44
Missing (%): 11.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 148.03652
Minimum: 22
Maximum: 490
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.2 KiB

Quantile statistics

Minimum: 22
5-th percentile: 78.75
Q1: 99
median: 121
Q3: 163
95-th percentile: 307.25
Maximum: 490
Range: 468
Interquartile range (IQR): 64

Descriptive statistics

Standard deviation: 79.281714
Coefficient of variation (CV): 0.53555512
Kurtosis: 4.2255936
Mean: 148.03652
Median Absolute Deviation (MAD): 25
Skewness: 2.0107732
Sum: 52701
Variance: 6285.5902
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
99 | 10 | 2.5%
93 | 9 | 2.2%
100 | 9 | 2.2%
107 | 8 | 2.0%
131 | 6 | 1.5%
140 | 6 | 1.5%
109 | 6 | 1.5%
92 | 6 | 1.5%
117 | 6 | 1.5%
130 | 6 | 1.5%
Other values (136) | 284 | 71.0%
(Missing) | 44 | 11.0%

Minimum 10 values
Value | Count | Frequency (%)
22 | 1 | 0.2%
70 | 5 | 1.2%
74 | 3 | 0.8%
75 | 2 | 0.5%
76 | 4 | 1.0%
78 | 3 | 0.8%
79 | 3 | 0.8%
80 | 2 | 0.5%
81 | 3 | 0.8%
82 | 3 | 0.8%

Maximum 10 values
Value | Count | Frequency (%)
490 | 2 | 0.5%
463 | 1 | 0.2%
447 | 1 | 0.2%
425 | 1 | 0.2%
424 | 2 | 0.5%
423 | 1 | 0.2%
415 | 1 | 0.2%
410 | 1 | 0.2%
380 | 1 | 0.2%
360 | 2 | 0.5%

blood _urea
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 118
Distinct (%): 31.0%
Missing: 19
Missing (%): 4.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 57.425722
Minimum: 1.5
Maximum: 391
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.2 KiB

Quantile statistics

Minimum: 1.5
5-th percentile: 17
Q1: 27
median: 42
Q3: 66
95-th percentile: 162
Maximum: 391
Range: 389.5
Interquartile range (IQR): 39

Descriptive statistics

Standard deviation: 50.503006
Coefficient of variation (CV): 0.87944921
Kurtosis: 9.3452886
Mean: 57.425722
Median Absolute Deviation (MAD): 16
Skewness: 2.6343745
Sum: 21879.2
Variance: 2550.5536
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
46 | 15 | 3.8%
25 | 13 | 3.2%
19 | 11 | 2.8%
40 | 10 | 2.5%
15 | 9 | 2.2%
48 | 9 | 2.2%
50 | 9 | 2.2%
18 | 9 | 2.2%
32 | 8 | 2.0%
49 | 8 | 2.0%
Other values (108) | 280 | 70.0%
(Missing) | 19 | 4.8%

Minimum 10 values
Value | Count | Frequency (%)
1.5 | 1 | 0.2%
10 | 2 | 0.5%
15 | 9 | 2.2%
16 | 7 | 1.8%
17 | 7 | 1.8%
18 | 9 | 2.2%
19 | 11 | 2.8%
20 | 7 | 1.8%
21 | 1 | 0.2%
22 | 6 | 1.5%

Maximum 10 values
Value | Count | Frequency (%)
391 | 1 | 0.2%
322 | 1 | 0.2%
309 | 1 | 0.2%
241 | 1 | 0.2%
235 | 1 | 0.2%
223 | 1 | 0.2%
219 | 1 | 0.2%
217 | 1 | 0.2%
215 | 1 | 0.2%
208 | 1 | 0.2%

serum_creatinine
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 84
Distinct (%): 21.9%
Missing: 17
Missing (%): 4.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.0724543
Minimum: 0.4
Maximum: 76
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.2 KiB

Quantile statistics

Minimum: 0.4
5-th percentile: 0.5
Q1: 0.9
median: 1.3
Q3: 2.8
95-th percentile: 11.89
Maximum: 76
Range: 75.6
Interquartile range (IQR): 1.9

Descriptive statistics

Standard deviation: 5.7411261
Coefficient of variation (CV): 1.8685798
Kurtosis: 79.304345
Mean: 3.0724543
Median Absolute Deviation (MAD): 0.6
Skewness: 7.5095383
Sum: 1176.75
Variance: 32.960529
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1.2 | 40 | 10.0%
1.1 | 24 | 6.0%
0.5 | 23 | 5.8%
1 | 23 | 5.8%
0.9 | 22 | 5.5%
0.7 | 22 | 5.5%
0.6 | 18 | 4.5%
0.8 | 17 | 4.2%
2.2 | 10 | 2.5%
1.5 | 9 | 2.2%
Other values (74) | 175 | 43.8%
(Missing) | 17 | 4.2%

Minimum 10 values
Value | Count | Frequency (%)
0.4 | 1 | 0.2%
0.5 | 23 | 5.8%
0.6 | 18 | 4.5%
0.7 | 22 | 5.5%
0.8 | 17 | 4.2%
0.9 | 22 | 5.5%
1 | 23 | 5.8%
1.1 | 24 | 6.0%
1.2 | 40 | 10.0%
1.3 | 8 | 2.0%

Maximum 10 values
Value | Count | Frequency (%)
76 | 1 | 0.2%
48.1 | 1 | 0.2%
32 | 1 | 0.2%
24 | 1 | 0.2%
18.1 | 1 | 0.2%
18 | 1 | 0.2%
16.9 | 1 | 0.2%
16.4 | 1 | 0.2%
15.2 | 1 | 0.2%
15 | 1 | 0.2%

sodium
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 34
Distinct (%): 10.9%
Missing: 87
Missing (%): 21.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 137.52875
Minimum: 4.5
Maximum: 163
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.2 KiB

Quantile statistics

Minimum: 4.5
5-th percentile: 125
Q1: 135
median: 138
Q3: 142
95-th percentile: 150
Maximum: 163
Range: 158.5
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 10.408752
Coefficient of variation (CV): 0.075684188
Kurtosis: 85.53437
Mean: 137.52875
Median Absolute Deviation (MAD): 3
Skewness: -6.9965686
Sum: 43046.5
Variance: 108.34212
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=34)

Common values
Value | Count | Frequency (%)
135 | 40 | 10.0%
140 | 25 | 6.2%
141 | 22 | 5.5%
139 | 21 | 5.2%
142 | 20 | 5.0%
138 | 20 | 5.0%
137 | 19 | 4.8%
150 | 17 | 4.2%
136 | 17 | 4.2%
147 | 13 | 3.2%
Other values (24) | 99 | 24.8%
(Missing) | 87 | 21.8%

Minimum 10 values
Value | Count | Frequency (%)
4.5 | 1 | 0.2%
104 | 1 | 0.2%
111 | 1 | 0.2%
113 | 2 | 0.5%
114 | 2 | 0.5%
115 | 1 | 0.2%
120 | 2 | 0.5%
122 | 2 | 0.5%
124 | 3 | 0.8%
125 | 2 | 0.5%

Maximum 10 values
Value | Count | Frequency (%)
163 | 1 | 0.2%
150 | 17 | 4.2%
147 | 13 | 3.2%
146 | 10 | 2.5%
145 | 11 | 2.8%
144 | 9 | 2.2%
143 | 4 | 1.0%
142 | 20 | 5.0%
141 | 22 | 5.5%
140 | 25 | 6.2%

potassium
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 40
Distinct (%): 12.8%
Missing: 88
Missing (%): 22.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.6272436
Minimum: 2.5
Maximum: 47
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.2 KiB

Quantile statistics

Minimum: 2.5
5-th percentile: 3.4
Q1: 3.8
median: 4.4
Q3: 4.9
95-th percentile: 5.7
Maximum: 47
Range: 44.5
Interquartile range (IQR): 1.1

Descriptive statistics

Standard deviation: 3.1939042
Coefficient of variation (CV): 0.69023904
Kurtosis: 142.50591
Mean: 4.6272436
Median Absolute Deviation (MAD): 0.5
Skewness: 11.582956
Sum: 1443.7
Variance: 10.201024
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=40)

Common values
Value | Count | Frequency (%)
3.5 | 30 | 7.5%
5 | 30 | 7.5%
4.9 | 27 | 6.8%
4.7 | 17 | 4.2%
4.8 | 16 | 4.0%
4 | 14 | 3.5%
4.1 | 14 | 3.5%
4.4 | 14 | 3.5%
3.9 | 14 | 3.5%
3.8 | 14 | 3.5%
Other values (30) | 122 | 30.5%
(Missing) | 88 | 22.0%

Minimum 10 values
Value | Count | Frequency (%)
2.5 | 2 | 0.5%
2.7 | 1 | 0.2%
2.8 | 1 | 0.2%
2.9 | 3 | 0.8%
3 | 2 | 0.5%
3.2 | 3 | 0.8%
3.3 | 3 | 0.8%
3.4 | 5 | 1.2%
3.5 | 30 | 7.5%
3.6 | 8 | 2.0%

Maximum 10 values
Value | Count | Frequency (%)
47 | 1 | 0.2%
39 | 1 | 0.2%
7.6 | 1 | 0.2%
6.6 | 1 | 0.2%
6.5 | 2 | 0.5%
6.4 | 1 | 0.2%
6.3 | 3 | 0.8%
5.9 | 2 | 0.5%
5.8 | 2 | 0.5%
5.7 | 4 | 1.0%
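The potassium maximum of 47 sits far from the bulk of the data (Q3 is 4.9), which also explains the very large skewness and kurtosis. One common way to list such values for review is Tukey's 1.5 × IQR rule. The report itself does not apply this rule; the sketch below is an illustration using the reported quartiles, and `tukey_fences` is a hypothetical helper name.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) fences at k * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles and the ten largest values, as reported for potassium.
low, high = tukey_fences(3.8, 4.9)
largest = [47, 39, 7.6, 6.6, 6.5, 6.4, 6.3, 5.9, 5.8, 5.7]

flagged = [v for v in largest if v > high]
print(f"upper fence {high:.2f}, flagged {flagged}")
```

Flagged values are only candidates for a closer look (for example, a possible unit or data-entry error); the rule does not prove they are wrong.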

hemoglobine
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 115
Distinct (%): 33.0%
Missing: 52
Missing (%): 13.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.526437
Minimum: 3.1
Maximum: 17.8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.2 KiB

Quantile statistics

Minimum: 3.1
5-th percentile: 7.9
Q1: 10.3
median: 12.65
Q3: 15
95-th percentile: 16.9
Maximum: 17.8
Range: 14.7
Interquartile range (IQR): 4.7

Descriptive statistics

Standard deviation: 2.9125866
Coefficient of variation (CV): 0.23251517
Kurtosis: -0.47139804
Mean: 12.526437
Median Absolute Deviation (MAD): 2.35
Skewness: -0.33509468
Sum: 4359.2
Variance: 8.4831608
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
15 | 16 | 4.0%
10.9 | 8 | 2.0%
13.6 | 7 | 1.8%
13 | 7 | 1.8%
9.8 | 7 | 1.8%
11.1 | 7 | 1.8%
10.3 | 6 | 1.5%
11.3 | 6 | 1.5%
13.9 | 6 | 1.5%
12 | 6 | 1.5%
Other values (105) | 272 | 68.0%
(Missing) | 52 | 13.0%

Minimum 10 values
Value | Count | Frequency (%)
3.1 | 1 | 0.2%
4.8 | 1 | 0.2%
5.5 | 1 | 0.2%
5.6 | 1 | 0.2%
5.8 | 1 | 0.2%
6 | 2 | 0.5%
6.1 | 1 | 0.2%
6.2 | 1 | 0.2%
6.3 | 1 | 0.2%
6.6 | 1 | 0.2%

Maximum 10 values
Value | Count | Frequency (%)
17.8 | 3 | 0.8%
17.7 | 1 | 0.2%
17.6 | 1 | 0.2%
17.5 | 1 | 0.2%
17.4 | 2 | 0.5%
17.3 | 1 | 0.2%
17.2 | 2 | 0.5%
17.1 | 2 | 0.5%
17 | 4 | 1.0%
16.9 | 2 | 0.5%

packed_cell_volume
Categorical

HIGH CORRELATION  MISSING 

Distinct: 44
Distinct (%): 13.3%
Missing: 70
Missing (%): 17.5%
Memory size: 3.2 KiB

Value | Count
52 | 21
41 | 21
48 | 19
44 | 19
40 | 16
Other values (39) | 234

Length

Max length: 3
Median length: 2
Mean length: 2
Min length: 1

Characters and Unicode

Total characters: 660
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10
Unique (%): 3.0%

Sample

1st row: 44
2nd row: 38
3rd row: 31
4th row: 32
5th row: 35

Common Values

Value | Count | Frequency (%)
52 | 21 | 5.2%
41 | 21 | 5.2%
48 | 19 | 4.8%
44 | 19 | 4.8%
40 | 16 | 4.0%
43 | 14 | 3.5%
42 | 13 | 3.2%
45 | 13 | 3.2%
36 | 12 | 3.0%
33 | 12 | 3.0%
Other values (34) | 170 | 42.5%
(Missing) | 70 | 17.5%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
52 | 21 | 6.4%
41 | 21 | 6.4%
48 | 19 | 5.8%
44 | 19 | 5.8%
40 | 16 | 4.8%
43 | 15 | 4.5%
42 | 13 | 3.9%
45 | 13 | 3.9%
32 | 12 | 3.6%
50 | 12 | 3.6%
Other values (33) | 169 | 51.2%

Most occurring characters

ValueCountFrequency (%)
4 175
26.5%
3 129
19.5%
2 96
14.5%
5 71
10.8%
1 41
 
6.2%
0 38
 
5.8%
8 37
 
5.6%
6 28
 
4.2%
9 23
 
3.5%
7 19
 
2.9%
Other values (2) 3
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 657
99.5%
Control 2
 
0.3%
Other Punctuation 1
 
0.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 175
26.6%
3 129
19.6%
2 96
14.6%
5 71
10.8%
1 41
 
6.2%
0 38
 
5.8%
8 37
 
5.6%
6 28
 
4.3%
9 23
 
3.5%
7 19
 
2.9%
Control
Value | Count | Frequency (%)
(non-printing control character) | 2 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
? | 1 | 100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 660
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 175
26.5%
3 129
19.5%
2 96
14.5%
5 71
10.8%
1 41
 
6.2%
0 38
 
5.8%
8 37
 
5.6%
6 28
 
4.2%
9 23
 
3.5%
7 19
 
2.9%
Other values (2) 3
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 660
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 175
26.5%
3 129
19.5%
2 96
14.5%
5 71
10.8%
1 41
 
6.2%
0 38
 
5.8%
8 37
 
5.6%
6 28
 
4.2%
9 23
 
3.5%
7 19
 
2.9%
Other values (2) 3
 
0.5%
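packed_cell_volume holds numeric readings but is typed as Categorical, apparently because a few cells carry non-numeric characters: the character summary above lists two control characters and one "?". A minimal cleaning sketch follows. The raw strings are invented for illustration, treating "?" as a missing-value placeholder is an assumption, and `parse_number` is a hypothetical helper.

```python
def parse_number(cell):
    """Return a float, or None when the cell is empty or a '?' placeholder."""
    if cell is None:
        return None
    text = cell.strip()  # drops surrounding whitespace, including tabs
    if text in ("", "?"):
        return None
    return float(text)


# Illustrative raw cells only; not taken from the dataset.
raw = ["44", "38", "\t43", "\t?", None]
cleaned = [parse_number(cell) for cell in raw]
print(cleaned)
```

After a pass like this the column could be profiled as a real number, which would give it the quantile and descriptive statistics shown for the other numeric variables.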

white_blood_cell_count
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 92
Distinct (%): 31.2%
Missing: 105
Missing (%): 26.2%
Memory size: 3.2 KiB

Value | Count
9800 | 11
6700 | 10
9600 | 9
7200 | 9
9200 | 9
Other values (87) | 247

Length

Max length: 5
Median length: 4
Mean length: 4.2271186
Min length: 2

Characters and Unicode

Total characters: 1247
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 34
Unique (%): 11.5%

Sample

1st row: 7800
2nd row: 6000
3rd row: 7500
4th row: 6700
5th row: 7300

Common Values

Value | Count | Frequency (%)
9800 | 11 | 2.8%
6700 | 10 | 2.5%
9600 | 9 | 2.2%
7200 | 9 | 2.2%
9200 | 9 | 2.2%
6900 | 8 | 2.0%
5800 | 8 | 2.0%
11000 | 8 | 2.0%
7800 | 7 | 1.8%
7000 | 7 | 1.8%
Other values (82) | 209 | 52.2%
(Missing) | 105 | 26.2%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
9800 | 11 | 3.7%
6700 | 10 | 3.4%
9600 | 9 | 3.1%
7200 | 9 | 3.1%
9200 | 9 | 3.1%
6900 | 8 | 2.7%
5800 | 8 | 2.7%
11000 | 8 | 2.7%
7800 | 7 | 2.4%
7000 | 7 | 2.4%
Other values (80) | 209 | 70.8%

Most occurring characters

ValueCountFrequency (%)
0 645
51.7%
1 99
 
7.9%
9 75
 
6.0%
6 75
 
6.0%
7 75
 
6.0%
8 67
 
5.4%
5 66
 
5.3%
2 55
 
4.4%
4 50
 
4.0%
3 36
 
2.9%
Other values (2) 4
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1243
99.7%
Control 3
 
0.2%
Other Punctuation 1
 
0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 645
51.9%
1 99
 
8.0%
9 75
 
6.0%
6 75
 
6.0%
7 75
 
6.0%
8 67
 
5.4%
5 66
 
5.3%
2 55
 
4.4%
4 50
 
4.0%
3 36
 
2.9%
Control
ValueCountFrequency (%)
3
100.0%
Other Punctuation
ValueCountFrequency (%)
? 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1247
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 645
51.7%
1 99
 
7.9%
9 75
 
6.0%
6 75
 
6.0%
7 75
 
6.0%
8 67
 
5.4%
5 66
 
5.3%
2 55
 
4.4%
4 50
 
4.0%
3 36
 
2.9%
Other values (2) 4
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1247
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 645
51.7%
1 99
 
7.9%
9 75
 
6.0%
6 75
 
6.0%
7 75
 
6.0%
8 67
 
5.4%
5 66
 
5.3%
2 55
 
4.4%
4 50
 
4.0%
3 36
 
2.9%
Other values (2) 4
 
0.3%

red_blood_cell_count
Categorical

HIGH CORRELATION  MISSING 

Distinct: 49
Distinct (%): 18.1%
Missing: 130
Missing (%): 32.5%
Memory size: 3.2 KiB

Value | Count
5.2 | 18
4.5 | 16
4.9 | 14
4.7 | 11
3.9 | 10
Other values (44) | 201

Length

Max length: 3
Median length: 3
Mean length: 2.9518519
Min length: 1

Characters and Unicode

Total characters: 797
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): 1.9%

Sample

1st row: 5.2
2nd row: 3.9
3rd row: 4.6
4th row: 4.4
5th row: 5

Common Values

Value | Count | Frequency (%)
5.2 | 18 | 4.5%
4.5 | 16 | 4.0%
4.9 | 14 | 3.5%
4.7 | 11 | 2.8%
3.9 | 10 | 2.5%
4.8 | 10 | 2.5%
4.6 | 9 | 2.2%
3.4 | 9 | 2.2%
5.9 | 8 | 2.0%
5.5 | 8 | 2.0%
Other values (39) | 157 | 39.2%
(Missing) | 130 | 32.5%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
5.2 | 18 | 6.7%
4.5 | 16 | 5.9%
4.9 | 14 | 5.2%
4.7 | 11 | 4.1%
3.9 | 10 | 3.7%
4.8 | 10 | 3.7%
4.6 | 9 | 3.3%
3.4 | 9 | 3.3%
6.1 | 8 | 3.0%
3.7 | 8 | 3.0%
Other values (39) | 157 | 58.1%

Most occurring characters

ValueCountFrequency (%)
. 263
33.0%
5 115
14.4%
4 115
14.4%
3 75
 
9.4%
6 52
 
6.5%
2 48
 
6.0%
9 34
 
4.3%
8 27
 
3.4%
7 26
 
3.3%
1 22
 
2.8%
Other values (3) 20
 
2.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 532
66.8%
Other Punctuation 264
33.1%
Control 1
 
0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 115
21.6%
4 115
21.6%
3 75
14.1%
6 52
9.8%
2 48
9.0%
9 34
 
6.4%
8 27
 
5.1%
7 26
 
4.9%
1 22
 
4.1%
0 18
 
3.4%
Other Punctuation
ValueCountFrequency (%)
. 263
99.6%
? 1
 
0.4%
Control
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 797
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 263
33.0%
5 115
14.4%
4 115
14.4%
3 75
 
9.4%
6 52
 
6.5%
2 48
 
6.0%
9 34
 
4.3%
8 27
 
3.4%
7 26
 
3.3%
1 22
 
2.8%
Other values (3) 20
 
2.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 797
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 263
33.0%
5 115
14.4%
4 115
14.4%
3 75
 
9.4%
6 52
 
6.5%
2 48
 
6.0%
9 34
 
4.3%
8 27
 
3.4%
7 26
 
3.3%
1 22
 
2.8%
Other values (3) 20
 
2.5%

hypertension
Boolean

Distinct: 2
Distinct (%): 0.5%
Missing: 2
Missing (%): 0.5%
Memory size: 928.0 B

Value | Count
False | 251
True | 147
(Missing) | 2

Value | Count | Frequency (%)
False | 251 | 62.7%
True | 147 | 36.8%
(Missing) | 2 | 0.5%

diabetes_mellitias
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 1.3%
Missing: 2
Missing (%): 0.5%
Memory size: 3.2 KiB

Value | Count
no | 258
yes | 134
no (with an extra control character) | 3
yes (with an extra control character) | 2
yes (with an extra space) | 1

Length

Max length: 4
Median length: 2
Mean length: 2.3592965
Min length: 2

Characters and Unicode

Total characters: 939
Distinct characters: 7
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.3%

Sample

1st row: yes
2nd row: no
3rd row: yes
4th row: no
5th row: no

Common Values

Value | Count | Frequency (%)
no | 258 | 64.5%
yes | 134 | 33.5%
no (with an extra control character) | 3 | 0.8%
yes (with an extra control character) | 2 | 0.5%
yes (with an extra space) | 1 | 0.2%
(Missing) | 2 | 0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
no | 261 | 65.6%
yes | 137 | 34.4%
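diabetes_mellitias reports five distinct labels but only two meanings: the character summary for this variable lists 5 control characters and 1 space, which accounts for the extra "no" and "yes" variants. Stripping surrounding whitespace collapses them into the 261 / 137 split shown above. In the sketch below the stray characters are assumed to be a leading tab or space; the report identifies them only by Unicode category.

```python
from collections import Counter

# Label counts as reported; the exact stray characters are an assumption.
raw_counts = {"no": 258, "yes": 134, "\tno": 3, "\tyes": 2, " yes": 1}

normalized = Counter()
for label, count in raw_counts.items():
    normalized[label.strip()] += count  # merge variants that differ only by whitespace

print(dict(normalized))
```

With the labels merged, the variable has two classes and could be treated as Boolean, like hypertension.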

Most occurring characters

Value | Count | Frequency (%)
n | 261 | 27.8%
o | 261 | 27.8%
y | 137 | 14.6%
e | 137 | 14.6%
s | 137 | 14.6%
(control character) | 5 | 0.5%
(space) | 1 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 933 | 99.4%
Control | 5 | 0.5%
Space Separator | 1 | 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
n | 261 | 28.0%
o | 261 | 28.0%
y | 137 | 14.7%
e | 137 | 14.7%
s | 137 | 14.7%
Control
Value | Count | Frequency (%)
(control character) | 5 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 933 | 99.4%
Common | 6 | 0.6%

Most frequent character per script

Latin
Value | Count | Frequency (%)
n | 261 | 28.0%
o | 261 | 28.0%
y | 137 | 14.7%
e | 137 | 14.7%
s | 137 | 14.7%
Common
Value | Count | Frequency (%)
(control character) | 5 | 83.3%
(space) | 1 | 16.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 939 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
n | 261 | 27.8%
o | 261 | 27.8%
y | 137 | 14.6%
e | 137 | 14.6%
s | 137 | 14.6%
(control character) | 5 | 0.5%
(space) | 1 | 0.1%
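
The category totals for diabetes_mellitias (Lowercase Letter 933, Control 5, Space Separator 1) can be reproduced from the value counts with Python's standard `unicodedata` module, which returns the two-letter general category of a character (`Ll`, `Cc`, `Zs`). This is a sketch under one assumption: the control character is taken to be a tab, which the report does not confirm.

```python
import unicodedata
from collections import Counter

# Value counts for diabetes_mellitias as listed in the report.
# ASSUMPTION: the control character is a tab and sits before the label.
raw_counts = {"no": 258, "yes": 134, "\tno": 3, "\tyes": 2, " yes": 1}


def category_counts(counts):
    """Tally Unicode general categories over every character of every value."""
    tally = Counter()
    for label, n in counts.items():
        for ch in label:
            tally[unicodedata.category(ch)] += n
    return dict(tally)


result = category_counts(raw_counts)
print(result)  # {'Ll': 933, 'Cc': 5, 'Zs': 1}
```

Here `Ll` is Lowercase Letter, `Cc` is Control and `Zs` is Space Separator; the three counts sum to the 939 total characters reported for the column.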

coronary_artery_disease
Categorical

Distinct: 3
Distinct (%): 0.8%
Missing: 2
Missing (%): 0.5%
Memory size: 3.2 KiB

Length

Max length: 3
Median length: 2
Mean length: 2.0904523
Min length: 2

Characters and Unicode

Total characters: 832
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: no
2nd row: no
3rd row: no
4th row: no
5th row: no

Common Values

Value | Count | Frequency (%)
no | 362 | 90.5%
yes | 34 | 8.5%
no (with a control character) | 2 | 0.5%
(Missing) | 2 | 0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
no | 364 | 91.5%
yes | 34 | 8.5%

Most occurring characters

Value | Count | Frequency (%)
n | 364 | 43.8%
o | 364 | 43.8%
y | 34 | 4.1%
e | 34 | 4.1%
s | 34 | 4.1%
(control character) | 2 | 0.2%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 830 | 99.8%
Control | 2 | 0.2%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
n | 364 | 43.9%
o | 364 | 43.9%
y | 34 | 4.1%
e | 34 | 4.1%
s | 34 | 4.1%
Control
Value | Count | Frequency (%)
(control character) | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 830 | 99.8%
Common | 2 | 0.2%

Most frequent character per script

Latin
Value | Count | Frequency (%)
n | 364 | 43.9%
o | 364 | 43.9%
y | 34 | 4.1%
e | 34 | 4.1%
s | 34 | 4.1%
Common
Value | Count | Frequency (%)
(control character) | 2 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 832 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
n | 364 | 43.8%
o | 364 | 43.8%
y | 34 | 4.1%
e | 34 | 4.1%
s | 34 | 4.1%
(control character) | 2 | 0.2%

appetite
Categorical

Distinct: 2
Distinct (%): 0.5%
Missing: 1
Missing (%): 0.2%
Memory size: 3.2 KiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 1596
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: good
2nd row: good
3rd row: poor
4th row: poor
5th row: good

Common Values

Value | Count | Frequency (%)
good | 317 | 79.2%
poor | 82 | 20.5%
(Missing) | 1 | 0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
good | 317 | 79.4%
poor | 82 | 20.6%

Most occurring characters

Value | Count | Frequency (%)
o | 798 | 50.0%
g | 317 | 19.9%
d | 317 | 19.9%
p | 82 | 5.1%
r | 82 | 5.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1596 | 100.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
o | 798 | 50.0%
g | 317 | 19.9%
d | 317 | 19.9%
p | 82 | 5.1%
r | 82 | 5.1%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1596 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
o | 798 | 50.0%
g | 317 | 19.9%
d | 317 | 19.9%
p | 82 | 5.1%
r | 82 | 5.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1596 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
o | 798 | 50.0%
g | 317 | 19.9%
d | 317 | 19.9%
p | 82 | 5.1%
r | 82 | 5.1%

pedal_edema
Boolean

Distinct: 2
Distinct (%): 0.5%
Missing: 1
Missing (%): 0.2%
Memory size: 928.0 B

Value | Count | Frequency (%)
False | 323 | 80.8%
True | 76 | 19.0%
(Missing) | 1 | 0.2%

anemia
Boolean

Distinct: 2
Distinct (%): 0.5%
Missing: 1
Missing (%): 0.2%
Memory size: 928.0 B

Value | Count | Frequency (%)
False | 339 | 84.8%
True | 60 | 15.0%
(Missing) | 1 | 0.2%

ckd
Categorical

Distinct: 3
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Length

Max length: 6
Median length: 3
Mean length: 4.13
Min length: 3

Characters and Unicode

Total characters: 1652
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ckd
2nd row: ckd
3rd row: ckd
4th row: ckd
5th row: ckd

Common Values

Value | Count | Frequency (%)
ckd | 248 | 62.0%
notckd | 150 | 37.5%
ckd (with a control character) | 2 | 0.5%
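
The length statistics for ckd are a quick way to spot the stray control character: two clean labels alone would give 250 × 3 + 150 × 6 = 1650 characters and a mean length of 4.125, whereas the report shows 1652 characters and a mean of 4.13. The sketch below rebuilds those figures from the value counts. The position of the control character (a trailing tab) is an assumption; only its presence is visible in the report.

```python
from statistics import mean, median

# Value counts for ckd as listed under "Common Values".
# ASSUMPTION: the control character is a tab appended to the label.
raw_counts = {"ckd": 248, "notckd": 150, "ckd\t": 2}

# One length per observation, so the statistics are weighted by count.
lengths = [len(label) for label, n in raw_counts.items() for _ in range(n)]

total_characters = sum(lengths)
max_length = max(lengths)
min_length = min(lengths)
median_length = median(lengths)
mean_length = mean(lengths)

print(total_characters, max_length, min_length, median_length, mean_length)
```

The totals match the report (1652 characters, maximum 6, minimum 3, median 3, mean 4.13), which supports reading the third category as `ckd` plus one extra character.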

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
ckd | 250 | 62.5%
notckd | 150 | 37.5%

Most occurring characters

Value | Count | Frequency (%)
c | 400 | 24.2%
k | 400 | 24.2%
d | 400 | 24.2%
n | 150 | 9.1%
o | 150 | 9.1%
t | 150 | 9.1%
(control character) | 2 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1650 | 99.9%
Control | 2 | 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
c | 400 | 24.2%
k | 400 | 24.2%
d | 400 | 24.2%
n | 150 | 9.1%
o | 150 | 9.1%
t | 150 | 9.1%
Control
Value | Count | Frequency (%)
(control character) | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1650 | 99.9%
Common | 2 | 0.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
c | 400 | 24.2%
k | 400 | 24.2%
d | 400 | 24.2%
n | 150 | 9.1%
o | 150 | 9.1%
t | 150 | 9.1%
Common
Value | Count | Frequency (%)
(control character) | 2 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1652 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
c | 400 | 24.2%
k | 400 | 24.2%
d | 400 | 24.2%
n | 150 | 9.1%
o | 150 | 9.1%
t | 150 | 9.1%
(control character) | 2 | 0.1%

Interactions

Correlations

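variable | id | age | blood_pressure | albumin | sugar | blood_glucose_random | blood _urea | serum_creatinine | sodium | potassium | hemoglobine | specific_gravety | red_blood_cells | pus_cell | pus_cell_clumps | bacteria | packed_cell_volume | white_blood_cell_count | red_blood_cell_count | hypertension | diabetes_mellitias | coronary_artery_disease | appetite | pedal_edema | anemia | ckd
id | 1.000 | -0.217 | -0.248 | -0.611 | -0.302 | -0.351 | -0.343 | -0.601 | 0.489 | -0.030 | 0.682 | 0.382 | 0.526 | 0.404 | 0.239 | 0.156 | 0.281 | 0.164 | 0.274 | 0.549 | 0.262 | 0.170 | 0.397 | 0.332 | 0.263 | 0.669
age | -0.217 | 1.000 | 0.123 | 0.213 | 0.281 | 0.299 | 0.309 | 0.350 | -0.134 | 0.072 | -0.230 | 0.088 | 0.064 | 0.133 | 0.180 | 0.073 | 0.058 | 0.113 | 0.037 | 0.392 | 0.221 | 0.122 | 0.129 | 0.116 | 0.127 | 0.243
blood_pressure | -0.248 | 0.123 | 1.000 | 0.195 | 0.217 | 0.177 | 0.184 | 0.305 | -0.137 | 0.091 | -0.276 | 0.170 | 0.390 | 0.290 | 0.157 | 0.127 | 0.414 | 0.138 | 0.432 | 0.357 | 0.234 | 0.000 | 0.254 | 0.160 | 0.286 | 0.306
albumin | -0.611 | 0.213 | 0.195 | 1.000 | 0.358 | 0.372 | 0.495 | 0.641 | -0.534 | 0.053 | -0.683 | 0.287 | 0.533 | 0.588 | 0.453 | 0.410 | 0.381 | 0.287 | 0.412 | 0.549 | 0.289 | 0.229 | 0.377 | 0.474 | 0.344 | 0.516
sugar | -0.302 | 0.281 | 0.217 | 0.358 | 1.000 | 0.602 | 0.223 | 0.356 | -0.229 | 0.055 | -0.296 | 0.183 | 0.213 | 0.222 | 0.197 | 0.186 | 0.233 | 0.382 | 0.273 | 0.370 | 0.305 | 0.258 | 0.257 | 0.165 | 0.145 | 0.247
blood_glucose_random | -0.351 | 0.299 | 0.177 | 0.372 | 0.602 | 1.000 | 0.195 | 0.359 | -0.261 | 0.072 | -0.349 | 0.209 | 0.364 | 0.387 | 0.176 | 0.104 | 0.000 | 0.320 | 0.191 | 0.446 | 0.272 | 0.187 | 0.250 | 0.207 | 0.135 | 0.318
blood _urea | -0.343 | 0.309 | 0.184 | 0.495 | 0.223 | 0.195 | 1.000 | 0.703 | -0.414 | 0.212 | -0.592 | 0.197 | 0.322 | 0.408 | 0.207 | 0.230 | 0.454 | 0.308 | 0.409 | 0.471 | 0.180 | 0.198 | 0.265 | 0.318 | 0.454 | 0.251
serum_creatinine | -0.601 | 0.350 | 0.305 | 0.641 | 0.356 | 0.359 | 0.703 | 1.000 | -0.497 | 0.129 | -0.726 | 0.139 | 0.209 | 0.241 | 0.000 | 0.000 | 0.379 | 0.406 | 0.477 | 0.180 | 0.000 | 0.048 | 0.141 | 0.272 | 0.375 | 0.099
sodium | 0.489 | -0.134 | -0.137 | -0.534 | -0.229 | -0.261 | -0.414 | -0.497 | 1.000 | 0.021 | 0.511 | 0.232 | 0.292 | 0.345 | 0.261 | 0.160 | 0.256 | 0.258 | 0.311 | 0.364 | 0.151 | 0.211 | 0.240 | 0.216 | 0.328 | 0.275
potassium | -0.030 | 0.072 | 0.091 | 0.053 | 0.055 | 0.072 | 0.212 | 0.129 | 0.021 | 1.000 | -0.063 | 0.039 | 0.000 | 0.185 | 0.000 | 0.000 | 0.511 | 0.209 | 0.550 | 0.059 | 0.000 | 0.000 | 0.075 | 0.135 | 0.168 | 0.000
hemoglobine | 0.682 | -0.230 | -0.276 | -0.683 | -0.296 | -0.349 | -0.592 | -0.726 | 0.511 | -0.063 | 1.000 | 0.322 | 0.489 | 0.552 | 0.368 | 0.232 | 0.690 | 0.087 | 0.470 | 0.605 | 0.360 | 0.175 | 0.431 | 0.436 | 0.690 | 0.594
specific_gravety | 0.382 | 0.088 | 0.170 | 0.287 | 0.183 | 0.209 | 0.197 | 0.139 | 0.232 | 0.039 | 0.322 | 1.000 | 0.435 | 0.385 | 0.284 | 0.204 | 0.299 | 0.264 | 0.388 | 0.419 | 0.268 | 0.128 | 0.274 | 0.352 | 0.249 | 0.556
red_blood_cells | 0.526 | 0.064 | 0.390 | 0.533 | 0.213 | 0.364 | 0.322 | 0.209 | 0.292 | 0.000 | 0.489 | 0.435 | 1.000 | 0.410 | 0.069 | 0.148 | 0.539 | 0.312 | 0.438 | 0.289 | 0.321 | 0.174 | 0.262 | 0.282 | 0.163 | 0.554
pus_cell | 0.404 | 0.133 | 0.290 | 0.588 | 0.222 | 0.387 | 0.408 | 0.241 | 0.345 | 0.185 | 0.552 | 0.385 | 0.410 | 1.000 | 0.501 | 0.311 | 0.588 | 0.154 | 0.569 | 0.372 | 0.288 | 0.208 | 0.303 | 0.403 | 0.315 | 0.463
pus_cell_clumps | 0.239 | 0.180 | 0.157 | 0.453 | 0.197 | 0.176 | 0.207 | 0.000 | 0.261 | 0.000 | 0.368 | 0.284 | 0.069 | 0.501 | 1.000 | 0.252 | 0.344 | 0.292 | 0.334 | 0.177 | 0.138 | 0.174 | 0.171 | 0.077 | 0.155 | 0.265
bacteria | 0.156 | 0.073 | 0.127 | 0.410 | 0.186 | 0.104 | 0.230 | 0.000 | 0.160 | 0.000 | 0.232 | 0.204 | 0.148 | 0.311 | 0.252 | 1.000 | 0.213 | 0.432 | 0.255 | 0.056 | 0.000 | 0.146 | 0.125 | 0.108 | 0.000 | 0.174
packed_cell_volume | 0.281 | 0.058 | 0.414 | 0.381 | 0.233 | 0.000 | 0.454 | 0.379 | 0.256 | 0.511 | 0.690 | 0.299 | 0.539 | 0.588 | 0.344 | 0.213 | 1.000 | 0.150 | 0.312 | 0.602 | 0.366 | 0.222 | 0.452 | 0.463 | 0.618 | 0.526
white_blood_cell_count | 0.164 | 0.113 | 0.138 | 0.287 | 0.382 | 0.320 | 0.308 | 0.406 | 0.258 | 0.209 | 0.087 | 0.264 | 0.312 | 0.154 | 0.292 | 0.432 | 0.150 | 1.000 | 0.131 | 0.000 | 0.315 | 0.000 | 0.226 | 0.256 | 0.274 | 0.000
red_blood_cell_count | 0.274 | 0.037 | 0.432 | 0.412 | 0.273 | 0.191 | 0.409 | 0.477 | 0.311 | 0.550 | 0.470 | 0.388 | 0.438 | 0.569 | 0.334 | 0.255 | 0.312 | 0.131 | 1.000 | 0.646 | 0.331 | 0.399 | 0.469 | 0.469 | 0.594 | 0.653
hypertension | 0.549 | 0.392 | 0.357 | 0.549 | 0.370 | 0.446 | 0.471 | 0.180 | 0.364 | 0.059 | 0.605 | 0.419 | 0.289 | 0.372 | 0.177 | 0.056 | 0.602 | 0.000 | 0.646 | 1.000 | 0.602 | 0.319 | 0.333 | 0.360 | 0.336 | 0.588
diabetes_mellitias | 0.262 | 0.221 | 0.234 | 0.289 | 0.305 | 0.272 | 0.180 | 0.000 | 0.151 | 0.000 | 0.360 | 0.268 | 0.321 | 0.288 | 0.138 | 0.000 | 0.366 | 0.315 | 0.331 | 0.602 | 1.000 | 0.172 | 0.320 | 0.302 | 0.165 | 0.392
coronary_artery_disease | 0.170 | 0.122 | 0.000 | 0.229 | 0.258 | 0.187 | 0.198 | 0.048 | 0.211 | 0.000 | 0.175 | 0.128 | 0.174 | 0.208 | 0.174 | 0.146 | 0.222 | 0.000 | 0.399 | 0.319 | 0.172 | 1.000 | 0.142 | 0.159 | 0.000 | 0.159
appetite | 0.397 | 0.129 | 0.254 | 0.377 | 0.257 | 0.250 | 0.265 | 0.141 | 0.240 | 0.075 | 0.431 | 0.274 | 0.262 | 0.303 | 0.171 | 0.125 | 0.452 | 0.226 | 0.469 | 0.333 | 0.320 | 0.142 | 1.000 | 0.406 | 0.241 | 0.404
pedal_edema | 0.332 | 0.116 | 0.160 | 0.474 | 0.165 | 0.207 | 0.318 | 0.272 | 0.216 | 0.135 | 0.436 | 0.352 | 0.282 | 0.403 | 0.077 | 0.108 | 0.463 | 0.256 | 0.469 | 0.360 | 0.302 | 0.159 | 0.406 | 1.000 | 0.191 | 0.372
anemia | 0.263 | 0.127 | 0.286 | 0.344 | 0.145 | 0.135 | 0.454 | 0.375 | 0.328 | 0.168 | 0.690 | 0.249 | 0.163 | 0.315 | 0.155 | 0.000 | 0.618 | 0.274 | 0.594 | 0.336 | 0.165 | 0.000 | 0.241 | 0.191 | 1.000 | 0.322
ckd | 0.669 | 0.243 | 0.306 | 0.516 | 0.247 | 0.318 | 0.251 | 0.099 | 0.275 | 0.000 | 0.594 | 0.556 | 0.554 | 0.463 | 0.265 | 0.174 | 0.526 | 0.000 | 0.653 | 0.588 | 0.392 | 0.159 | 0.404 | 0.372 | 0.322 | 1.000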
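
The matrix mixes two kinds of numbers. Pairs of numeric variables carry a sign (hemoglobine against serum_creatinine is -0.726), which is what a rank or linear correlation produces, while every entry that involves a categorical variable such as ckd is non-negative, which is what an association measure bounded to [0, 1] produces. The report does not name the measures in this section, so the following is only a sketch of one common choice for the categorical case, the uncorrected Cramér's V, written without third-party dependencies. The toy data is hypothetical and not taken from the dataset.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length sequences of labels.

    Returns a value in [0, 1]: 0 means no association, 1 means each label
    of one variable determines the label of the other.
    """
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    k = min(len(row_totals), len(col_totals))
    if n == 0 or k < 2:
        return 0.0
    joint = Counter(zip(xs, ys))
    chi2 = 0.0
    for r, r_n in row_totals.items():
        for c, c_n in col_totals.items():
            expected = r_n * c_n / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (k - 1)))


# Hypothetical toy data for illustration only.
hypertension = ["yes", "yes", "no", "no", "yes", "no", "no", "no"]
ckd = ["ckd", "ckd", "notckd", "notckd", "ckd", "ckd", "notckd", "notckd"]

v = cramers_v(hypertension, ckd)
print(round(v, 3))
```

Because such a measure has no sign, a value like 0.588 between hypertension and ckd says the two are associated but not in which direction; a cross-tabulation is needed for that.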

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
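
Nullity correlation can be illustrated with the first ten sample rows of this report, where sodium and potassium are missing in exactly the same rows (0, 1, 2, 4, 7, 8) and red_blood_cells is missing in rows 0, 1, 5 and 6. The sketch below encodes each column as a 0/1 missing indicator and takes the Pearson correlation of the indicators. Ten rows are far too few to stand in for the full 400, so the numbers only demonstrate the mechanics.

```python
import math

# 1 = value is NaN, 0 = value is present, for sample rows 0..9 of the report.
sodium_missing = [1, 1, 1, 0, 1, 0, 0, 1, 1, 0]
potassium_missing = [1, 1, 1, 0, 1, 0, 0, 1, 1, 0]
red_blood_cells_missing = [1, 1, 0, 0, 0, 1, 1, 0, 0, 0]


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # a column that is never (or always) missing
    return cov / math.sqrt(var_a * var_b)


same_pattern = pearson(sodium_missing, potassium_missing)
different_pattern = pearson(sodium_missing, red_blood_cells_missing)
print(round(same_pattern, 3), round(different_pattern, 3))
```

A nullity correlation of 1 means two columns are always missing together, as sodium and potassium are in this sample; values near 0 mean the missingness of one says little about the other.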

Sample

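First rows

(index) | id | age | blood_pressure | specific_gravety | albumin | sugar | red_blood_cells | pus_cell | pus_cell_clumps | bacteria | blood_glucose_random | blood _urea | serum_creatinine | sodium | potassium | hemoglobine | packed_cell_volume | white_blood_cell_count | red_blood_cell_count | hypertension | diabetes_mellitias | coronary_artery_disease | appetite | pedal_edema | anemia | ckd
0 | 0 | 48.0 | 80.0 | 1.020 | 1.0 | 0.0 | NaN | normal | notpresent | notpresent | 121.0 | 36.0 | 1.2 | NaN | NaN | 15.4 | 44 | 7800 | 5.2 | yes | yes | no | good | no | no | ckd
1 | 1 | 7.0 | 50.0 | 1.020 | 4.0 | 0.0 | NaN | normal | notpresent | notpresent | NaN | 18.0 | 0.8 | NaN | NaN | 11.3 | 38 | 6000 | NaN | no | no | no | good | no | no | ckd
2 | 2 | 62.0 | 80.0 | 1.010 | 2.0 | 3.0 | normal | normal | notpresent | notpresent | 423.0 | 53.0 | 1.8 | NaN | NaN | 9.6 | 31 | 7500 | NaN | no | yes | no | poor | no | yes | ckd
3 | 3 | 48.0 | 70.0 | 1.005 | 4.0 | 0.0 | normal | abnormal | present | notpresent | 117.0 | 56.0 | 3.8 | 111.0 | 2.5 | 11.2 | 32 | 6700 | 3.9 | yes | no | no | poor | yes | yes | ckd
4 | 4 | 51.0 | 80.0 | 1.010 | 2.0 | 0.0 | normal | normal | notpresent | notpresent | 106.0 | 26.0 | 1.4 | NaN | NaN | 11.6 | 35 | 7300 | 4.6 | no | no | no | good | no | no | ckd
5 | 5 | 60.0 | 90.0 | 1.015 | 3.0 | 0.0 | NaN | NaN | notpresent | notpresent | 74.0 | 25.0 | 1.1 | 142.0 | 3.2 | 12.2 | 39 | 7800 | 4.4 | yes | yes | no | good | yes | no | ckd
6 | 6 | 68.0 | 70.0 | 1.010 | 0.0 | 0.0 | NaN | normal | notpresent | notpresent | 100.0 | 54.0 | 24.0 | 104.0 | 4.0 | 12.4 | 36 | NaN | NaN | no | no | no | good | no | no | ckd
7 | 7 | 24.0 | NaN | 1.015 | 2.0 | 4.0 | normal | abnormal | notpresent | notpresent | 410.0 | 31.0 | 1.1 | NaN | NaN | 12.4 | 44 | 6900 | 5 | no | yes | no | good | yes | no | ckd
8 | 8 | 52.0 | 100.0 | 1.015 | 3.0 | 0.0 | normal | abnormal | present | notpresent | 138.0 | 60.0 | 1.9 | NaN | NaN | 10.8 | 33 | 9600 | 4.0 | yes | yes | no | good | no | yes | ckd
9 | 9 | 53.0 | 90.0 | 1.020 | 2.0 | 0.0 | abnormal | abnormal | present | notpresent | 70.0 | 107.0 | 7.2 | 114.0 | 3.7 | 9.5 | 29 | 12100 | 3.7 | yes | yes | no | poor | no | yes | ckd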
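
Last rows

(index) | id | age | blood_pressure | specific_gravety | albumin | sugar | red_blood_cells | pus_cell | pus_cell_clumps | bacteria | blood_glucose_random | blood _urea | serum_creatinine | sodium | potassium | hemoglobine | packed_cell_volume | white_blood_cell_count | red_blood_cell_count | hypertension | diabetes_mellitias | coronary_artery_disease | appetite | pedal_edema | anemia | ckd
390 | 390 | 52.0 | 80.0 | 1.025 | 0.0 | 0.0 | normal | normal | notpresent | notpresent | 99.0 | 25.0 | 0.8 | 135.0 | 3.7 | 15.0 | 52 | 6300 | 5.3 | no | no | no | good | no | no | notckd
391 | 391 | 36.0 | 80.0 | 1.025 | 0.0 | 0.0 | normal | normal | notpresent | notpresent | 85.0 | 16.0 | 1.1 | 142.0 | 4.1 | 15.6 | 44 | 5800 | 6.3 | no | no | no | good | no | no | notckd
392 | 392 | 57.0 | 80.0 | 1.020 | 0.0 | 0.0 | normal | normal | notpresent | notpresent | 133.0 | 48.0 | 1.2 | 147.0 | 4.3 | 14.8 | 46 | 6600 | 5.5 | no | no | no | good | no | no | notckd
393 | 393 | 43.0 | 60.0 | 1.025 | 0.0 | 0.0 | normal | normal | notpresent | notpresent | 117.0 | 45.0 | 0.7 | 141.0 | 4.4 | 13.0 | 54 | 7400 | 5.4 | no | no | no | good | no | no | notckd
394 | 394 | 50.0 | 80.0 | 1.020 | 0.0 | 0.0 | normal | normal | notpresent | notpresent | 137.0 | 46.0 | 0.8 | 139.0 | 5.0 | 14.1 | 45 | 9500 | 4.6 | no | no | no | good | no | no | notckd
395 | 395 | 55.0 | 80.0 | 1.020 | 0.0 | 0.0 | normal | normal | notpresent | notpresent | 140.0 | 49.0 | 0.5 | 150.0 | 4.9 | 15.7 | 47 | 6700 | 4.9 | no | no | no | good | no | no | notckd
396 | 396 | 42.0 | 70.0 | 1.025 | 0.0 | 0.0 | normal | normal | notpresent | notpresent | 75.0 | 31.0 | 1.2 | 141.0 | 3.5 | 16.5 | 54 | 7800 | 6.2 | no | no | no | good | no | no | notckd
397 | 397 | 12.0 | 80.0 | 1.020 | 0.0 | 0.0 | normal | normal | notpresent | notpresent | 100.0 | 26.0 | 0.6 | 137.0 | 4.4 | 15.8 | 49 | 6600 | 5.4 | no | no | no | good | no | no | notckd
398 | 398 | 17.0 | 60.0 | 1.025 | 0.0 | 0.0 | normal | normal | notpresent | notpresent | 114.0 | 50.0 | 1.0 | 135.0 | 4.9 | 14.2 | 51 | 7200 | 5.9 | no | no | no | good | no | no | notckd
399 | 399 | 58.0 | 80.0 | 1.025 | 0.0 | 0.0 | normal | normal | notpresent | notpresent | 131.0 | 18.0 | 1.1 | 141.0 | 3.5 | 15.8 | 53 | 6800 | 6.1 | no | no | no | good | no | no | notckd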
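
In the sample rows, red_blood_cell_count appears both as `5` and as `4.0`, and the overview flags white_blood_cell_count as a high-cardinality variable, which suggests these measurement columns are stored as text rather than as numbers. A minimal sketch of coercing such cells to floats, treating anything unparseable as missing. The helper name `to_float` is hypothetical, and the input list simply copies the red_blood_cell_count cells from the first ten sample rows.

```python
def to_float(text):
    """Parse a cell as a float; return None for missing or unparseable text."""
    if text is None:
        return None
    cleaned = str(text).strip()  # drop stray tabs and spaces around the number
    if cleaned == "" or cleaned.lower() == "nan":
        return None
    try:
        return float(cleaned)
    except ValueError:
        return None


# red_blood_cell_count cells from sample rows 0..9, kept as text.
cells = ["5.2", "NaN", "NaN", "3.9", "4.6", "4.4", "NaN", "5", "4.0", "3.7"]
values = [to_float(c) for c in cells]
print(values)
```

With pandas, `pd.to_numeric(series, errors="coerce")` plays the same role after stripping whitespace, turning unparseable entries into NaN.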
{% endblock %}